Skip to content

Refactor Error Handling#16

Open
aj-bagwell wants to merge 8 commits intolthiery:mainfrom
aj-bagwell:error-handling
Open

Refactor Error Handling#16
aj-bagwell wants to merge 8 commits intolthiery:mainfrom
aj-bagwell:error-handling

Conversation

@aj-bagwell
Copy link
Contributor

Refactor the error handling:

  • Split the Error enum in to recoverable client errors and unrecoverable socket errors.
  • Return more errors to client methods instead of crashing the runner
  • Improve Scan errors instead of leaving clients hanging when scan fails
  • Make run take self as ref so that the socket if errors happen caller can reconnect without invalidating client
  • Rely of drop breaking channels instead of explicitly informing client that runner is stopping

Add helper methods for reading a line from the socket, and for sending a
request and waiting for the response.
It adds a debug log message if there is no listener for the boradcasts.
This is to allow the runner to keep going if there is an issue with any
single response from the wpa_supplicant.

IO errors indicate that the socket is broken so still go to the worker
so it can be reconnected.
Sometime the scan request will return OK but the WiFi driver will fail to start a scan.
This results in a scan failed event. The request then needs to get an error to aviod
hanging forever.
This splits the error type into two:
SocketError: for anything IO related
Error: for anything user/protocol related

Socket errors go stop the runners which can then reconnect or what ever
other apropriate action is needed

Other errors go to the client caller who can try and correct the
malformed inputs.

I also reorganised and simplified the errors.
The channel was complicating error handling and not adding much as both
sides of the channel were going into the same select!() call.

The channel for broadcast requests remains.
@aj-bagwell
Copy link
Contributor Author

Hi,

Sorry for taking so long to post this follow up pull request with my changes to the error handling.

The first four commits (up to when I split the Error enum ) should be backwards compatible, the last two ("Split Error enum" and "Remove event Channel") are where the big changes are.

I'm posting this more as a starter for discussion to see if you are still interested in the idea. If you don't like it, just say so and it can go straight to /dev/null.

I'm not sure I like how much I am still leaking internal details e.g. in ParseError or SocketError::ClientChannelClosed.

I am also not sure about how to treat SelectResult as returning Ok(SelectResult::InvalidNetworkId) feels a bit like "Task failed successfully". On one hand I am tempted to a RunnerNotRunning or something to SelectResult then fn select_network would not need to wrap it in a result. On the other I am tempted to move PendingSelect InvalidNetworkId and Timeout into ClientError. InvalidNetworkId should apply to more things (e.g set the netwokr ssid) but wpa supplicant just returns "FAIL"

And finally there is the broader issue of timeouts, my gut feel is that there are still ways in which the clients can hang indefinitely, and every request should have a similar timeout mechanism to select_network.

Only strip the trailing new line on reponses before parsing
The scan results are tab separated and if the ssid is empty it can end
with a tab which breaks the parsing if removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant